A Mono-lingual Corpus-Based Machine Translation of the Interlingua Method
Abstract
This paper describes a prototype of an example-based machine translation system. The key language resources of the system are the EDR corpus and a concept classification dictionary. The corpus consists of a pair of sentences, their morphological representations, their syntactic representations, and their semantic representations. The semantic representations are described in an interlingua, so the corpus can be viewed either as a mono-lingual corpus or as a parallel corpus between a natural language and an interlingua. The system analyses source sentences and generates target sentences using example databases. Similarity calculations play an essential role in both the analysis and generation phases, and these calculations use the concept classification dictionary. The translation system is realized by directly combining source language analysis and target language generation, without a transfer phase. The system has been implemented, and evaluation data from the current prototype suggest that the corpus-based MT approach has good prospects.
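The abstract does not say how similarity is computed from the concept classification dictionary. As a rough illustration only, the sketch below scores two concepts by their position in a small hand-made hierarchy and retrieves the closest stored example. The hierarchy, the function names, and the Wu-Palmer-style measure are all assumptions made for this example and are not taken from the EDR resources or from the system described.

```python
# Toy concept hierarchy (child -> parent). Hypothetical data, not the EDR dictionary.
PARENT = {
    "dog": "animal", "cat": "animal", "animal": "entity",
    "car": "vehicle", "vehicle": "artifact", "artifact": "entity",
    "entity": None,
}


def path_to_root(concept):
    """Return the chain of concepts from `concept` up to the root, inclusive."""
    path = []
    while concept is not None:
        path.append(concept)
        concept = PARENT[concept]
    return path


def depth(concept):
    """Depth of a concept in the hierarchy; the root has depth 1."""
    return len(path_to_root(concept))


def concept_similarity(a, b):
    """Wu-Palmer-style score: 2 * depth(lcs) / (depth(a) + depth(b)).

    `lcs` is the deepest ancestor shared by both concepts.
    This measure is an assumption, not the one used by the system.
    """
    ancestors_a = set(path_to_root(a))
    # Walking upward from b, the first shared ancestor is the deepest one.
    lcs = next(c for c in path_to_root(b) if c in ancestors_a)
    return 2 * depth(lcs) / (depth(a) + depth(b))


def sequence_similarity(xs, ys):
    """Average position-wise concept similarity; 0.0 if the lengths differ."""
    if not xs or len(xs) != len(ys):
        return 0.0
    return sum(concept_similarity(x, y) for x, y in zip(xs, ys)) / len(xs)


def best_example(query, examples):
    """Pick the stored (concepts, payload) pair closest to the query concepts."""
    return max(examples, key=lambda ex: sequence_similarity(query, ex[0]))


examples = [(["cat"], "example-A"), (["car"], "example-B")]
print(best_example(["dog"], examples)[1])
```

In an example-based setting, a lookup of this kind could be applied both when matching a source sentence against analysis examples and when choosing a generation example for an interlingua expression; the position-wise comparison used here is deliberately simplistic.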
Similar resources
English-Persian Plagiarism Detection based on a Semantic Approach
Plagiarism, which is defined as “the wrongful appropriation of other writers’ or authors’ works and ideas without citing or informing them”, poses a major challenge to the spread and publication of knowledge. Plagiarism has been classified into four categories: direct, paraphrasing (rewriting), translation, and combinatory. This paper addresses translational plagiarism which is sometimes referred to as cross-li...
Mapping Verbs in Different Languages to Knowledge Base Relations using Web Text as Interlingua
In recent years many knowledge bases (KBs) have been constructed, yet no verb resource maps to these growing KB resources. A resource that maps verbs in different languages to KB relations would be useful for extracting facts from text into the KBs and for aiding alignment and integration of knowledge across different KBs and languages. Such a multi-lingual verb resource would...
Multi-Lingual Phrase-Based Statistical Machine Translation for Arabic-English
In this paper, we implement a multilingual Statistical Machine Translation (SMT) system for Arabic-English translation. Arabic text can be categorized into standard and dialectal Arabic, and these two forms differ significantly. Different mono-lingual and multi-lingual hybrid SMT approaches are compared. Mono-lingual systems do always result in better translation accuracy in one Arabic fo...
Automated Corpus Analysis and the Acquisition of Large, Multi-Lingual Knowledge Bases for MT
Although knowledge-based MT systems have the potential to achieve high translation accuracy, each successful application system requires a large amount of hand-coded knowledge (lexicons, grammars, mapping rules, etc.). Systems like KBMT-89 and its descendents have demonstrated how knowledge-based translation can produce good results in technical domains with tractable domain semantics. Neverthe...